The ggplot2 package is based on the principle that all plots consist of a few basic components: data, a coordinate system and a visual representation of the data. In ggplot2, you build plots incrementally, starting with the data and coordinates you want to use and then specifying the graphical features: lines, points, bars, color, etc.
We will use three datasets in this tutorial: our main dataset on phage-host interactions, the diamonds dataset that ships with the ggplot2 package, and the CO2 (carbon dioxide uptake) dataset built into R.
For a list of preconfigured datasets, simply type data().
data()
dataset <- read.delim("phages.tsv")
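The dotted column names used throughout this tutorial (for example molGC.... and Genome.Length..bp.) are what read.delim() produces when it sanitizes headers containing spaces or symbols. The sketch below illustrates this, assuming the raw headers looked like "molGC (%)" and "Genome Length (bp)"; that is a guess, since phages.tsv itself is not shown here.

```r
# Hypothetical raw headers; the real ones in phages.tsv may differ.
raw_headers <- c("Accession", "molGC (%)", "Genome Length (bp)")

# read.delim() uses check.names = TRUE by default, which runs make.names()
# on the headers and replaces every invalid character with a dot.
clean_headers <- make.names(raw_headers)
print(clean_headers)

# The same thing happens when reading a small in-memory TSV (toy data).
toy <- read.delim(text = "Accession\tmolGC (%)\nA1\t45.2\nA2\t51.7")
print(names(toy))

# To keep the headers untouched, set check.names = FALSE and use backticks.
toy_raw <- read.delim(
  text = "Accession\tmolGC (%)\nA1\t45.2\nA2\t51.7",
  check.names = FALSE
)
print(toy_raw$`molGC (%)`)
```

Running names(dataset) after loading the real file shows the exact column names to use in the plotting calls.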
Before we begin, load the ggplot2 package for R. ggplot2 is a graphics package that provides powerful plotting capabilities beyond R’s base plotting functions. We won’t actually get into ggplot2 itself quite yet. This will be a basic introduction to plotting in R.
library(ggplot2)
dataset
A histogram is a univariate plot (a plot that displays one variable) that groups a numeric variable into bins and displays the number of observations that fall within each bin. A histogram is a useful tool for getting a sense of the distribution of a numeric variable.
hist(dataset$Positive.Strand....)
Note: When you create a plot in a local RStudio environment, it will appear in the bottom right pane under the “plots” tab. Use the left and right arrows to cycle through the plots you’ve created.
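To see what the binning described above does, it can help to look at the numbers behind a histogram. The sketch below uses the built-in faithful dataset as a stand-in for the phage data (an assumption made so the example runs without phages.tsv).

```r
# Built-in dataset used as a stand-in for the phage data.
x <- faithful$eruptions

# plot = FALSE returns the binning without drawing anything.
h <- hist(x, breaks = 10, plot = FALSE)
print(h$breaks) # Bin edges
print(h$counts) # Number of observations in each bin

# breaks is only a suggestion: R picks "pretty" edges near the requested number.
hist(x, breaks = 10, main = "About 10 bins", xlab = "Eruption time (min)")
hist(x, breaks = 30, main = "About 30 bins", xlab = "Eruption time (min)")
```

Fewer bins give a smoother but coarser picture; more bins show finer detail at the cost of noisier counts.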
Boxplots are another type of univariate plot for summarizing distributions of numeric data graphically.
boxplot(dataset$molGC....)
The central box of the boxplot represents the middle 50% of the observations, the central bar is the median, and the whiskers (the bars at the ends of the dotted lines) enclose the great majority of the observations. Circles that lie beyond the ends of the whiskers are data points that may be outliers.
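The five numbers that a boxplot draws can also be computed directly with boxplot.stats(). The sketch below uses the built-in mtcars dataset rather than the phage data, purely for illustration.

```r
# Horsepower from the built-in mtcars dataset (stand-in for the phage data).
x <- mtcars$hp

s <- boxplot.stats(x)

# s$stats holds, in order: lower whisker, lower hinge (bottom of the box),
# median, upper hinge (top of the box), upper whisker.
print(s$stats)

# s$out holds the points beyond the whiskers, drawn as circles in the plot.
# By default the whiskers reach the most extreme point within
# 1.5 box lengths of the box (coef = 1.5).
print(s$out)

boxplot(x, ylab = "Horsepower")
```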
One of the most useful features of the boxplot() function is the ability to make side-by-side boxplots. A side-by-side boxplot takes a numeric variable and splits it based on some categorical variable, drawing a different boxplot for each level of the categorical variable.
boxplot(molGC.... ~ Molecule, data = dataset) # Plot GC content split by molecule type
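The formula interface reads as "numeric variable ~ grouping variable". As a small sketch on the built-in mtcars dataset (used here only because it is always available), boxplot() also invisibly returns the per-group statistics it drew:

```r
# Miles per gallon split by number of cylinders.
b <- boxplot(mpg ~ cyl,
  data = mtcars,
  xlab = "Cylinders",
  ylab = "Miles per gallon"
)

print(b$names) # One label per level of the grouping variable
print(b$n)     # Number of observations in each group
print(b$stats) # One column of five summary values per group
```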
A density plot shows the distribution of a numeric variable with a continuous curve. It is similar to a histogram, but because it has no discrete bins, a density plot can give a better picture of the underlying shape of a distribution.
plot(density(dataset$molGC....))
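How smooth the curve looks depends on the bandwidth of the kernel density estimate, which plays a role similar to bin width in a histogram. A minimal sketch on the built-in faithful dataset (a stand-in for the phage data), using the adjust argument of density() to scale the default bandwidth:

```r
x <- faithful$eruptions

d_default <- density(x)               # Default bandwidth
d_smooth  <- density(x, adjust = 2)   # Twice the default: smoother curve
d_rough   <- density(x, adjust = 0.5) # Half the default: more detail

plot(d_rough, lty = 3, main = "Effect of bandwidth on a density plot")
lines(d_default, lty = 1)
lines(d_smooth, lty = 2)
legend("top",
  legend = c("adjust = 1", "adjust = 2", "adjust = 0.5"),
  lty = 1:3
)

# The area under a density curve is approximately 1.
area <- sum(d_default$y) * diff(d_default$x[1:2])
print(area)
```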
Barplots are graphs that visually display counts of categorical variables.
dataset$Jumbophage <- ifelse(dataset$Jumbophage, "Jumbophage", "Not Jumbophage")
# Printing the whole column would list all 18,406 values; head() shows the first six
head(dataset$Jumbophage)
[1] "Jumbophage" "Jumbophage" "Jumbophage" "Jumbophage" "Jumbophage" "Jumbophage"
barplot(table(dataset$Molecule))
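# Stacked barplot: one bar per molecule type, split by jumbophage status
barplot(table(dataset$Jumbophage, dataset$Molecule),
  legend.text = TRUE # Label the sections with the table's row names
)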
A grouped barplot is an alternative to a stacked barplot that gives each stacked section its own bar. To make a grouped barplot, create a stacked barplot and add the extra argument beside = TRUE.
barplot(table(dataset$Jumbophage, dataset$Molecule),
  legend.text = TRUE, # Label the bars with the table's row names
  beside = TRUE # Group instead of stacking
)
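The difference between stacked and grouped barplots is easiest to see on a small two-way table. The sketch below uses the built-in mtcars dataset (transmission type by number of cylinders) instead of the phage data, so it runs on its own.

```r
# Two-way table of counts: rows = transmission (0/1), columns = cylinders.
counts <- table(am = mtcars$am, cyl = mtcars$cyl)
print(counts)

# Stacked: one bar per column, with the rows stacked inside each bar.
stacked_mid <- barplot(counts, legend.text = TRUE, main = "Stacked")

# Grouped: one bar per cell, with bars for the same column placed side by side.
grouped_mid <- barplot(counts,
  legend.text = TRUE,
  beside = TRUE,
  main = "Grouped"
)

# barplot() returns the bar midpoints, which reflect the layout:
# a vector (one per bar) when stacked, a matrix (one per cell) when grouped.
print(stacked_mid)
print(grouped_mid)
```

Note that legend.text = TRUE takes the legend labels from the row names of the table, which works whether the grouping variable is a character vector or a factor.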
Scatterplots are bivariate (two variable) plots that take two numeric variables and plot data points on the x/y plane.
plot(dataset$molGC...., dataset$Positive.Strand....)
# Same plot with 90% transparent points, so that dense regions stand out
plot(dataset$molGC....,
  dataset$Positive.Strand....,
  col = rgb(red = 0, green = 0, blue = 0, alpha = 0.1)
)
We can make our plots more presentable by adding axis labels, a title and colors.
barplot(table(dataset$Jumbophage, dataset$Molecule),
  legend.text = TRUE, # Label the bars with the table's row names
  beside = TRUE,
  xlab = "Molecular Type",
  ylab = "Count",
  main = "Molecular Type, Grouped by Jumbophage",
  col = c("#CCDE57", "#718200") # One color per Jumbophage category
)
The ggplot() function creates plots incrementally in layers. Every ggplot starts with a call to the ggplot() function, along with an argument specifying the data set to be used and aesthetic mappings from variables in the data set to visual properties of the plot, such as x and y position.
# install.packages("tidyverse")
library(tidyverse)
We are not going to spend much time on qplot(), the quick-plot shortcut in ggplot2, since the ggplot() syntax is at the heart of the package and qplot() is deprecated as of ggplot2 3.4.0. Let’s look at one qplot for illustrative purposes and then move on.
library(ggplot2)
qplot(
x = carat, # x variable
y = price, # y variable
data = diamonds, # Data set
geom = "point", # Plot type
color = clarity, # Color points by variable clarity
xlab = "Carat Weight", # x label
ylab = "Price", # y label
main = "Diamond Carat vs. Price"
)
Warning: `qplot()` was deprecated in ggplot2 3.4.0.
# Scatterplot of GC content for each accession
ggplot(
dataset,
aes(Accession, molGC....)
) +
geom_point()
In the code above, we specify the data we want to work with and then assign the variables of interest, Accession and GC Content, to the x and y values of the plot. “aes()” is an aesthetics wrapper used in ggplot to map variables to visual properties. When you want a visual property to change based on the value of a variable, that specification belongs inside an aes() wrapper. If you are setting a fixed value that doesn’t change based on a variable, it belongs outside of aes().
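The difference between mapping a variable inside aes() and setting a fixed value outside it can be sketched with the built-in mtcars dataset (used here instead of the phage data so the example is self-contained). ggplot_build() is used only to inspect the colours each plot ends up with.

```r
library(ggplot2)

# Mapped: colour varies with a variable, so it goes inside aes().
p_mapped <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(colour = factor(cyl)))

# Fixed: every point gets the same colour, so it goes outside aes().
p_fixed <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(colour = "steelblue")

# Inspect the colours actually assigned to the points.
mapped_colours <- unique(ggplot_build(p_mapped)$data[[1]]$colour)
fixed_colours <- unique(ggplot_build(p_fixed)$data[[1]]$colour)
print(mapped_colours) # One colour per level of factor(cyl)
print(fixed_colours)  # A single colour

print(p_mapped)
print(p_fixed)
```

A common mistake is writing aes(colour = "steelblue"): that maps the constant string as if it were a variable, so the points get a default palette colour and a legend instead of the intended colour.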
Note: Add a new element to a plot by putting a “+” after the preceding element.
The layers you add determine the type of plot you create. In this case, we used geom_point(), which simply draws the data as points at the specified x and y coordinates, creating a scatterplot. ggplot2 has a wide range of geoms to create different types of plots. Here is a list of geoms for all the plot types we covered in the last lesson, plus a few more:
geom_histogram() # histogram (default stat: bin; position: stack)
geom_density() # density plot (default stat: density; position: identity)
geom_boxplot() # boxplot (default stat: boxplot; position: dodge2)
geom_violin() # violin plot, a combination of boxplot and density plot (default stat: ydensity; position: dodge)
geom_bar() # bar graph (default stat: count; position: stack)
geom_point() # scatterplot (default stat: identity; position: identity)
geom_jitter() # scatterplot with points randomly perturbed to reduce overlap (position: jitter)
geom_line() # line graph (default stat: identity; position: identity)
geom_errorbar() # add error bars (default stat: identity; position: identity)
geom_smooth() # add a best-fit line (default stat: smooth; se = TRUE)
geom_abline() # add a line with specified slope and intercept
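Because a plot is built from a base plus layers, the same data and aesthetic mappings can be reused with different geoms. A short sketch on the built-in mtcars dataset (a stand-in chosen so the code runs without the phage file):

```r
library(ggplot2)

# Data and mappings are defined once, with no layers yet.
base <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  labs(x = "Cylinders", y = "Miles per gallon")

# Adding a different geom to the same base gives a different plot type.
p_box <- base + geom_boxplot()
p_violin <- base + geom_violin()
p_jitter <- base + geom_jitter(width = 0.1, height = 0)

print(p_box)
print(p_violin)
print(p_jitter)
```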
Notice that the scatterplot we made above has no coloring. We can color the points by their molecule type.
ggplot(dataset, aes(Accession, molGC...., colour = Molecule)) +
geom_point(alpha = 0.5)
ggplot(dataset, aes(Accession, molGC....)) +
geom_point(aes(color = Molecule), alpha = 0.2)
We pass alpha in as an argument outside of the aes() mapping because we are setting alpha to a fixed value instead of mapping it to a variable.
With alpha set to 0.2, each data point is 80% transparent. At such high transparency, single data points are hard to see, but it lets us focus on high density areas.
dataset %>%
ggplot(aes(Genome.Length..bp., molGC....)) +
geom_point(aes(colour = Molecule), alpha = 0.5) +
labs(x = "Genome Length", y = "GC Content")
geom_smooth
The following example illustrates geom_smooth() using a dataset built into R.
In ggplot2, the geom_smooth() function is used to add a smooth line or curve to a plot. It is commonly used to visualize the trend or relationship between variables.
sample_DataSet <- CO2
sample_DataSet
Grouped Data: uptake ~ conc | Plant
Plant Type Treatment conc uptake
1 Qn1 Quebec nonchilled 95 16.0
2 Qn1 Quebec nonchilled 175 30.4
3 Qn1 Quebec nonchilled 250 34.8
4 Qn1 Quebec nonchilled 350 37.2
5 Qn1 Quebec nonchilled 500 35.3
6 Qn1 Quebec nonchilled 675 39.2
7 Qn1 Quebec nonchilled 1000 39.7
8 Qn2 Quebec nonchilled 95 13.6
9 Qn2 Quebec nonchilled 175 27.3
10 Qn2 Quebec nonchilled 250 37.1
11 Qn2 Quebec nonchilled 350 41.8
12 Qn2 Quebec nonchilled 500 40.6
13 Qn2 Quebec nonchilled 675 41.4
14 Qn2 Quebec nonchilled 1000 44.3
15 Qn3 Quebec nonchilled 95 16.2
16 Qn3 Quebec nonchilled 175 32.4
17 Qn3 Quebec nonchilled 250 40.3
18 Qn3 Quebec nonchilled 350 42.1
19 Qn3 Quebec nonchilled 500 42.9
20 Qn3 Quebec nonchilled 675 43.9
21 Qn3 Quebec nonchilled 1000 45.5
22 Qc1 Quebec chilled 95 14.2
23 Qc1 Quebec chilled 175 24.1
24 Qc1 Quebec chilled 250 30.3
25 Qc1 Quebec chilled 350 34.6
26 Qc1 Quebec chilled 500 32.5
27 Qc1 Quebec chilled 675 35.4
28 Qc1 Quebec chilled 1000 38.7
29 Qc2 Quebec chilled 95 9.3
30 Qc2 Quebec chilled 175 27.3
31 Qc2 Quebec chilled 250 35.0
32 Qc2 Quebec chilled 350 38.8
33 Qc2 Quebec chilled 500 38.6
34 Qc2 Quebec chilled 675 37.5
35 Qc2 Quebec chilled 1000 42.4
36 Qc3 Quebec chilled 95 15.1
37 Qc3 Quebec chilled 175 21.0
38 Qc3 Quebec chilled 250 38.1
39 Qc3 Quebec chilled 350 34.0
40 Qc3 Quebec chilled 500 38.9
41 Qc3 Quebec chilled 675 39.6
42 Qc3 Quebec chilled 1000 41.4
43 Mn1 Mississippi nonchilled 95 10.6
44 Mn1 Mississippi nonchilled 175 19.2
45 Mn1 Mississippi nonchilled 250 26.2
46 Mn1 Mississippi nonchilled 350 30.0
47 Mn1 Mississippi nonchilled 500 30.9
48 Mn1 Mississippi nonchilled 675 32.4
49 Mn1 Mississippi nonchilled 1000 35.5
50 Mn2 Mississippi nonchilled 95 12.0
51 Mn2 Mississippi nonchilled 175 22.0
52 Mn2 Mississippi nonchilled 250 30.6
53 Mn2 Mississippi nonchilled 350 31.8
54 Mn2 Mississippi nonchilled 500 32.4
55 Mn2 Mississippi nonchilled 675 31.1
56 Mn2 Mississippi nonchilled 1000 31.5
57 Mn3 Mississippi nonchilled 95 11.3
58 Mn3 Mississippi nonchilled 175 19.4
59 Mn3 Mississippi nonchilled 250 25.8
60 Mn3 Mississippi nonchilled 350 27.9
61 Mn3 Mississippi nonchilled 500 28.5
62 Mn3 Mississippi nonchilled 675 28.1
63 Mn3 Mississippi nonchilled 1000 27.8
64 Mc1 Mississippi chilled 95 10.5
65 Mc1 Mississippi chilled 175 14.9
66 Mc1 Mississippi chilled 250 18.1
67 Mc1 Mississippi chilled 350 18.9
68 Mc1 Mississippi chilled 500 19.5
69 Mc1 Mississippi chilled 675 22.2
70 Mc1 Mississippi chilled 1000 21.9
71 Mc2 Mississippi chilled 95 7.7
72 Mc2 Mississippi chilled 175 11.4
73 Mc2 Mississippi chilled 250 12.3
74 Mc2 Mississippi chilled 350 13.0
75 Mc2 Mississippi chilled 500 12.5
76 Mc2 Mississippi chilled 675 13.7
77 Mc2 Mississippi chilled 1000 14.4
78 Mc3 Mississippi chilled 95 10.6
79 Mc3 Mississippi chilled 175 18.0
80 Mc3 Mississippi chilled 250 17.9
81 Mc3 Mississippi chilled 350 17.9
82 Mc3 Mississippi chilled 500 17.9
83 Mc3 Mississippi chilled 675 18.9
84 Mc3 Mississippi chilled 1000 19.9
ggplot(sample_DataSet, aes(conc, uptake, colour = Treatment)) +
  geom_point(size = 3, alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE) # Linear fit per treatment, no confidence band
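With a linear-model method, geom_smooth() fits a straight line of y on x separately for each colour group. As a rough sketch of what that corresponds to, the same kind of fit can be run by hand with lm() on the CO2 data; the helper names co2_df, fits and slopes are introduced here for illustration only.

```r
library(ggplot2)

# Drop the grouped-data classes so CO2 behaves like a plain data frame.
co2_df <- as.data.frame(CO2)

# One linear model of uptake on concentration per treatment.
fits <- lapply(
  split(co2_df, co2_df$Treatment),
  function(d) lm(uptake ~ conc, data = d)
)

# Slope of each fitted line (change in uptake per unit of concentration).
slopes <- sapply(fits, function(m) coef(m)[["conc"]])
print(slopes)

# The same lines drawn by geom_smooth(), with the formula stated explicitly.
p <- ggplot(co2_df, aes(conc, uptake, colour = Treatment)) +
  geom_point(size = 3, alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
print(p)
```

Setting se = TRUE instead would add a shaded confidence band around each line.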
To split the plot further into panels based on a categorical variable in your data, add facet_wrap().
ggplot(sample_DataSet, aes(conc, uptake, colour = Treatment)) +
  geom_point(size = 3, alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~Type) + # One panel per plant origin
  labs(title = "Concentration of CO2") +
  theme_bw()
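facet_wrap() also takes arguments that control how the panels are arranged. The sketch below, again on the CO2 data, stacks the panels in a single column and lets each panel have its own y-axis range; whether free scales are appropriate depends on the comparison you want readers to make.

```r
library(ggplot2)

co2_df <- as.data.frame(CO2)

p <- ggplot(co2_df, aes(conc, uptake, colour = Treatment)) +
  geom_point(size = 3, alpha = 0.5) +
  facet_wrap(~Type,
    ncol = 1,          # Stack the panels in one column
    scales = "free_y"  # Let each panel choose its own y-axis range
  ) +
  theme_bw()

# The computed panel layout: one row of this table per panel.
panel_layout <- ggplot_build(p)$layout$layout
print(panel_layout)

print(p)
```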
Now that we know the basics of creating plots with ggplot(), let’s remake some of the plots we created last time and see how they look in ggplot2, starting with a histogram.
ggplot(data = dataset, aes(x = molGC....)) +
geom_histogram(
fill = "skyblue",
col = "black"
) +
labs(x = "GC Content")
ggplot(dataset, aes(Molecule, Genome.Length..bp.)) +
  geom_jitter( # Add jittered data points
    alpha = 0.05,
    color = "yellow"
  ) +
  geom_boxplot() + # Draw the boxplots on top of the points
  labs(
    title = "Genome Length of Different Molecular Types",
    y = "Genome Length"
  )
ggplot(data = dataset, aes(x = molGC....)) +
  geom_density(
    position = "stack", # Create a stacked density chart
    aes(fill = Molecule), # Fill based on molecule type
    alpha = 0.5 # Set transparency
  )
One of the most powerful aspects of plots is the ability to visually illustrate relationships between 3 or more variables. When we create a plot, each different dimension (variable) needs to map to a different perceptual feature (aesthetic) such as x position, y position, symbol, size or color. Making use of several of these aesthetics at once lets us make plots involving many dimensions. We’ve already seen some examples of multidimensional plots, such as the first scatterplot in this lesson that displayed carat weight and price colored by clarity.
Faceting is another way to add an extra dimension to a plot. Faceting breaks a plot up based on a factor variable and draws a different plot for each level of the factor. You can create a faceted plot in ggplot2 by adding a facet_wrap() layer.
ggplot(data = diamonds, aes(x = carat, y = price)) + # Initialize plot
  geom_point(aes(color = color), alpha = 0.5) + # Color based on diamond color
  facet_wrap(~clarity) + # Facet on clarity
  geom_smooth() + # Add an estimated fit line
  theme(
    legend.position = "inside", # ggplot2 >= 3.5.0; older versions take legend.position = c(0.85, 0.16)
    legend.position.inside = c(0.85, 0.16) # Set legend position within the plot area
  )
Scales are parameters in ggplot2 that determine how a plot maps values to visual properties (aesthetics). If you don’t specify a scale for an aesthetic, the plot will use a default scale. For instance, the plots we split on color all used a default color scale. You can specify custom scales by adding scale elements to your plot. Scale elements have the following structure:
scale_aesthetic_scaletype()
We already saw an example of manually chosen colors when we made the grouped barplot above, where we passed a vector of fill colors to the col argument of barplot(). In ggplot2, the scale that manually sets the fill colors of bars is:
scale_fill_manual()
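As a sketch of how this scale is used, here is a grouped barplot in ggplot2 with manually chosen fill colors. It uses the built-in mtcars dataset rather than the phage data, and the colors are two of the hex values used earlier in this tutorial.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(am))) +
  geom_bar(position = "dodge") + # Grouped (side-by-side) bars of counts
  scale_fill_manual(
    name = "Transmission",
    values = c("0" = "#CCDE57", "1" = "#718200"), # One color per level
    labels = c("0" = "Automatic", "1" = "Manual")
  ) +
  labs(x = "Cylinders", y = "Count")

# The fill colors actually assigned to the bars.
used_fills <- unique(ggplot_build(p)$data[[1]]$fill)
print(used_fills)

print(p)
```

Naming the values by factor level, as above, keeps the color assignment stable even if the order of the levels changes.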
Let’s make a new scatterplot with several aesthetic properties and alter some of the scales.
ggplot(data = diamonds, aes(x = carat, y = price)) + # Initialize plot
  geom_point(aes(
    size = carat, # Size points based on carat
    color = color, # Color based on diamond color
    alpha = clarity # Set transparency based on clarity
  )) +
  scale_color_manual(values = c( # Use manual color values
    "#FFFFFF", "#F5FCC2",
    "#E0ED87", "#CCDE57",
    "#B3C732", "#94A813",
    "#718200"
  )) +
  scale_alpha_manual(values = c( # Use manual alpha values
    0.1, 0.15, 0.2,
    0.3, 0.4, 0.6,
    0.8, 1
  )) +
  scale_size_identity() + # Set size values to the actual values of carat
  xlim(0, 2.5) + # Limit x-axis
  theme(panel.background = element_rect(fill = "#7FB2B8")) + # Change background color
  theme(legend.key = element_rect(fill = "#7FB2B8")) # Change legend background color
De La Salle University, Manila, Philippines, daphne_janelyn_go@dlsu.edu.ph